# Multi-source data pre-training

| Model | Author | License | Description | Tags | Downloads | Likes |
| --- | --- | --- | --- | --- | --- | --- |
| Minueza 32M Base | Felladrin | Apache-2.0 | A base model with 32 million parameters, trained entirely on extensive English text corpora; suitable for text generation tasks. | Large Language Model, Transformers, English | 68 | 18 |
| Arabic T5 Small | flax-community | | An Arabic language model built on the T5 v1.1 small architecture, trained on a combination of multiple Arabic datasets. | Large Language Model, Arabic | 279 | 10 |
| Roberta Hindi | flax-community | | A RoBERTa model pre-trained on a large Hindi corpus, supporting masked language modeling tasks. | Large Language Model | 212 | 2 |
| Convbert Base Generator Finnish | Finnish-NLP | Apache-2.0 | A Finnish ConvBERT generator model pre-trained with the Replaced Token Detection (RTD) objective, specialized for fill-mask tasks. | Large Language Model, Transformers, Other | 36 | 0 |
| Gpt2 Medium Finnish | Finnish-NLP | Apache-2.0 | A GPT-2 model with 345 million parameters, pre-trained on a large Finnish text corpus; excels at Finnish text generation. | Large Language Model, Other | 30 | 3 |
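For the text-generation models listed above (Minueza 32M Base, Gpt2 Medium Finnish), the sketch below shows how such a model might be loaded with the Hugging Face `transformers` pipeline. The repo id `Felladrin/Minueza-32M-Base` is an assumption inferred from the listed author and model name, not confirmed by this page.

```python
# Minimal text-generation sketch with the transformers pipeline.
# The repo id below is an assumption (listed author + model name).
from transformers import pipeline

generator = pipeline("text-generation", model="Felladrin/Minueza-32M-Base")

result = generator(
    "Once upon a time,",
    max_new_tokens=40,   # cap the length of the generated continuation
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.8,     # moderate randomness
)
print(result[0]["generated_text"])
```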
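The masked-language models in the table (Roberta Hindi, Convbert Base Generator Finnish) are used through the fill-mask task instead. A minimal sketch follows, assuming the repo id `Finnish-NLP/convbert-base-generator-finnish` and a BERT-style `[MASK]` token; both are assumptions based on the listing, not confirmed by this page.

```python
# Minimal fill-mask sketch for the masked-language models above.
# Repo id and [MASK] token are assumptions based on the listing.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="Finnish-NLP/convbert-base-generator-finnish")

# Finnish prompt, roughly: "Hi, I am a [MASK] language model."
for prediction in unmasker("Moikka olen [MASK] kielimalli."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The pipeline returns the highest-scoring replacements for the masked position along with their probabilities, which is the standard way to exercise an RTD-pretrained generator like this one.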